Overview

Dataset statistics

Number of variables: 23
Number of observations: 266824
Missing cells: 1226294
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 46.8 MiB
Average record size in memory: 184.0 B
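
The missing-cell percentage follows from the three counts above. A minimal sketch of that arithmetic, using only figures reported in this overview; the variable names are illustrative and not part of the report, and the record size is only approximate because the total size is rounded to 0.1 MiB.

```python
# Figures copied from the dataset statistics above.
n_variables = 23
n_observations = 266824
missing_cells = 1226294
total_size_mib = 46.8  # rounded in the report, so results below are approximate

# Every cell is one (row, column) pair.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_mib * 1024 ** 2 / n_observations

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"record size:   about {avg_record_bytes:.0f} B")
```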

Variable types

Numeric: 8
Categorical: 15

Warnings

sueldo_smdlv is highly correlated with otros_ingresos_smdlv and 1 other field [High correlation]
otros_ingresos_smdlv is highly correlated with sueldo_smdlv [High correlation]
año_credito is highly correlated with sueldo_smdlv [High correlation]
sueldo_smdlv is highly correlated with otros_ingresos_smdlv and 1 other field [High correlation]
otros_ingresos_smdlv is highly correlated with sueldo_smdlv [High correlation]
año_credito is highly correlated with sueldo_smdlv [High correlation]
municipio_residencia is highly correlated with municipio_credito and 2 other fields [High correlation]
periodo_credito is highly correlated with estado_final and 2 other fields [High correlation]
estado_final is highly correlated with periodo_credito and 1 other field [High correlation]
sueldo_smdlv is highly correlated with otros_ingresos_smdlv [High correlation]
municipio_credito is highly correlated with municipio_residencia and 2 other fields [High correlation]
municipio_expedicion is highly correlated with municipio_residencia and 2 other fields [High correlation]
cuotas is highly correlated with forma_pago [High correlation]
genero is highly correlated with Row [High correlation]
forma_pago is highly correlated with estado_final and 3 other fields [High correlation]
año_credito is highly correlated with periodo_credito and 2 other fields [High correlation]
Row is highly correlated with periodo_credito and 4 other fields [High correlation]
otros_ingresos_smdlv is highly correlated with sueldo_smdlv [High correlation]
municipio_nacimiento is highly correlated with municipio_residencia and 3 other fields [High correlation]
municipio_expedicion is highly correlated with municipio_nacimiento [High correlation]
tipo_persona is highly correlated with genero [High correlation]
municipio_credito is highly correlated with municipio_residencia [High correlation]
municipio_residencia is highly correlated with municipio_credito [High correlation]
genero is highly correlated with tipo_persona [High correlation]
forma_pago is highly correlated with estado_final [High correlation]
estado_final is highly correlated with forma_pago [High correlation]
municipio_nacimiento is highly correlated with municipio_expedicion [High correlation]
genero has 151596 (56.8%) missing values [Missing]
estado_civil has 157960 (59.2%) missing values [Missing]
edad has 168709 (63.2%) missing values [Missing]
municipio_nacimiento has 11704 (4.4%) missing values [Missing]
municipio_expedicion has 142399 (53.4%) missing values [Missing]
tiene_casa_propia has 158961 (59.6%) missing values [Missing]
sueldo_smdlv has 174837 (65.5%) missing values [Missing]
otros_ingresos_smdlv has 260110 (97.5%) missing values [Missing]
Row is uniformly distributed [Uniform]
Row has unique values [Unique]
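
As a cross-check on the Missing warnings, the per-variable counts can be added up and compared with the 1226294 missing cells in the dataset statistics. A minimal sketch: the dictionary is typed in by hand from the warnings, and the 18 missing values of sector are taken from that variable's own section, since sector has no Missing warning.

```python
# Missing counts per variable, copied from the warnings above.
# "sector" has no warning; its 18 missing values come from its variable section.
missing_by_variable = {
    "genero": 151596,
    "estado_civil": 157960,
    "edad": 168709,
    "municipio_nacimiento": 11704,
    "municipio_expedicion": 142399,
    "tiene_casa_propia": 158961,
    "sueldo_smdlv": 174837,
    "otros_ingresos_smdlv": 260110,
    "sector": 18,
}
n_observations = 266824

total_missing = sum(missing_by_variable.values())

for name, count in missing_by_variable.items():
    share = 100 * count / n_observations
    print(f"{name:22s} {count:7d} {share:5.1f}%")
print(f"{'total':22s} {total_missing:7d}")
```

The total matching the overview figure suggests that no other variable in the report has missing values.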

Reproduction

Analysis started: 2021-05-12 19:52:36.338663
Analysis finished: 2021-05-12 19:53:51.288029
Duration: 1 minute and 14.95 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Row
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 266824
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133412.5
Minimum: 1
Maximum: 266824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 13342.15
Q1: 66706.75
Median: 133412.5
Q3: 200118.25
95-th percentile: 253482.85
Maximum: 266824
Range: 266823
Interquartile range (IQR): 133411.5

Descriptive statistics

Standard deviation: 77025.59845
Coefficient of variation (CV): 0.5773491873
Kurtosis: -1.2
Mean: 133412.5
Median Absolute Deviation (MAD): 66706
Skewness: -1.074941117 × 10^-15
Sum: 3.55976569 × 10^10
Variance: 5932942817
Monotonicity: Strictly increasing
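
Row is strictly increasing from 1 to 266824 with no repeats, so it behaves like a plain row counter and its summary statistics have closed forms. A sketch of that check, assuming the values are exactly the integers 1..n, that the variance uses the sample (n - 1) denominator, and that quantiles use linear interpolation; these assumptions are not stated in the report, but they reproduce the reported figures.

```python
import math

n = 266824  # number of observations; Row is assumed to be 1, 2, ..., n

total = n * (n + 1) // 2       # sum of 1..n
mean = (n + 1) / 2
variance = n * (n + 1) / 12    # sample variance (n - 1 denominator) of 1..n
std = math.sqrt(variance)
cv = std / mean                # coefficient of variation


def quantile(p):
    """Linear-interpolation quantile of the sequence 1..n."""
    return 1 + p * (n - 1)


iqr = quantile(0.75) - quantile(0.25)

print(f"sum      = {total}")
print(f"mean     = {mean}")
print(f"variance = {variance:.0f}")
print(f"std      = {std:.5f}")
print(f"CV       = {cv:.10f}")
print(f"5% / Q1 / Q3 / 95% = {quantile(0.05):.2f} / {quantile(0.25):.2f} / "
      f"{quantile(0.75):.2f} / {quantile(0.95):.2f}")
print(f"IQR      = {iqr}")
```

Because Row only encodes row order, any correlation reported between Row and another variable most likely reflects how the file is sorted rather than a property of the loans themselves.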
Histogram with fixed size bins (bins=50)
Common values
Value                    Count    Frequency (%)
2047                     1        < 0.1%
92792                    1        < 0.1%
72310                    1        < 0.1%
66165                    1        < 0.1%
68212                    1        < 0.1%
78451                    1        < 0.1%
80498                    1        < 0.1%
74353                    1        < 0.1%
76400                    1        < 0.1%
119407                   1        < 0.1%
Other values (266814)    266814   > 99.9%

Minimum 10 values
Value    Count    Frequency (%)
1        1        < 0.1%
2        1        < 0.1%
3        1        < 0.1%
4        1        < 0.1%
5        1        < 0.1%
6        1        < 0.1%
7        1        < 0.1%
8        1        < 0.1%
9        1        < 0.1%
10       1        < 0.1%

Maximum 10 values
Value     Count    Frequency (%)
266824    1        < 0.1%
266823    1        < 0.1%
266822    1        < 0.1%
266821    1        < 0.1%
266820    1        < 0.1%
266819    1        < 0.1%
266818    1        < 0.1%
266817    1        < 0.1%
266816    1        < 0.1%
266815    1        < 0.1%

procedencia
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Nacional: 266101
Extranjero: 723

Length

Max length: 10
Median length: 8
Mean length: 8.005419303
Min length: 8

Characters and Unicode

Total characters: 2136038
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nacional
2nd row: Nacional
3rd row: Nacional
4th row: Nacional
5th row: Nacional

Common Values

Value         Count     Frequency (%)
Nacional      266101    99.7%
Extranjero    723       0.3%

Length

Histogram of lengths of the category

Pie chart

Value         Count     Frequency (%)
nacional      266101    99.7%
extranjero    723       0.3%

Most occurring characters

Value               Count     Frequency (%)
a                   532925    24.9%
o                   266824    12.5%
n                   266824    12.5%
N                   266101    12.5%
c                   266101    12.5%
i                   266101    12.5%
l                   266101    12.5%
r                   1446      0.1%
E                   723       < 0.1%
x                   723       < 0.1%
Other values (3)    2169      0.1%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    1869214    87.5%
Uppercase Letter    266824     12.5%

Most frequent character per category

Lowercase Letter
Value    Count     Frequency (%)
a        532925    28.5%
o        266824    14.3%
n        266824    14.3%
c        266101    14.2%
i        266101    14.2%
l        266101    14.2%
r        1446      0.1%
x        723       < 0.1%
t        723       < 0.1%
j        723       < 0.1%
Uppercase Letter
Value    Count     Frequency (%)
N        266101    99.7%
E        723       0.3%

Most occurring scripts

Value    Count      Frequency (%)
Latin    2136038    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    2136038    100.0%

genero
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 151596
Missing (%): 56.8%
Memory size: 2.0 MiB
Femenino: 61702
Masculino: 53526

Length

Max length: 9
Median length: 8
Mean length: 8.464522512
Min length: 8

Characters and Unicode

Total characters: 975350
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Femenino
2nd row: Femenino
3rd row: Femenino
4th row: Femenino
5th row: Femenino

Common Values

Value        Count     Frequency (%)
Femenino     61702     23.1%
Masculino    53526     20.1%
(Missing)    151596    56.8%
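
The frequencies in the Common Values table are relative to all 266824 rows, missing values included. A minimal sketch, using only the counts reported for genero, of how the shares change when the denominator is restricted to non-missing rows, and of how the reported length statistics follow from the same counts. The variable names are illustrative.

```python
# Counts copied from the Common Values table for genero.
counts = {"Femenino": 61702, "Masculino": 53526}
n_observations = 266824

n_present = sum(counts.values())        # rows with a recorded value
n_missing = n_observations - n_present  # rows without one

for value, count in counts.items():
    share_all = 100 * count / n_observations   # denominator: every row
    share_present = 100 * count / n_present    # denominator: non-missing rows
    print(f"{value:10s} {share_all:5.1f}% of all rows, "
          f"{share_present:5.1f}% of non-missing rows")

# Length statistics are computed over the non-missing strings only.
total_characters = sum(len(value) * count for value, count in counts.items())
mean_length = total_characters / n_present
print(f"total characters: {total_characters}, mean length: {mean_length:.9f}")
```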

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
femenino     61702    53.5%
masculino    53526    46.5%

Most occurring characters

Value               Count     Frequency (%)
n                   176930    18.1%
e                   123404    12.7%
i                   115228    11.8%
o                   115228    11.8%
F                   61702     6.3%
m                   61702     6.3%
M                   53526     5.5%
a                   53526     5.5%
s                   53526     5.5%
c                   53526     5.5%
Other values (2)    107052    11.0%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    860122    88.2%
Uppercase Letter    115228    11.8%

Most frequent character per category

Lowercase Letter
Value    Count     Frequency (%)
n        176930    20.6%
e        123404    14.3%
i        115228    13.4%
o        115228    13.4%
m        61702     7.2%
a        53526     6.2%
s        53526     6.2%
c        53526     6.2%
u        53526     6.2%
l        53526     6.2%
Uppercase Letter
Value    Count    Frequency (%)
F        61702    53.5%
M        53526    46.5%

Most occurring scripts

Value    Count     Frequency (%)
Latin    975350    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    975350    100.0%

estado_civil
Categorical

MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 157960
Missing (%): 59.2%
Memory size: 2.0 MiB
Union Libre: 41320
Soltero: 33474
Casado: 32484
Viudo: 1006
Divorciado: 580

Length

Max length: 11
Median length: 7
Mean length: 8.217335391
Min length: 5

Characters and Unicode

Total characters: 894572
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Union Libre
2nd row: Union Libre
3rd row: Union Libre
4th row: Union Libre
5th row: Soltero

Common Values

Value          Count     Frequency (%)
Union Libre    41320     15.5%
Soltero        33474     12.5%
Casado         32484     12.2%
Viudo          1006      0.4%
Divorciado     580       0.2%
(Missing)      157960    59.2%

Length

Histogram of lengths of the category

Pie chart

Value         Count    Frequency (%)
union         41320    27.5%
libre         41320    27.5%
soltero       33474    22.3%
casado        32484    21.6%
viudo         1006     0.7%
divorciado    580      0.4%

Most occurring characters

Value                Count     Frequency (%)
o                    142918    16.0%
i                    84806     9.5%
n                    82640     9.2%
r                    75374     8.4%
e                    74794     8.4%
a                    65548     7.3%
U                    41320     4.6%
(space)              41320     4.6%
L                    41320     4.6%
b                    41320     4.6%
Other values (11)    203212    22.7%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    703068    78.6%
Uppercase Letter    150184    16.8%
Space Separator     41320     4.6%

Most frequent character per category

Lowercase Letter
Value               Count     Frequency (%)
o                   142918    20.3%
i                   84806     12.1%
n                   82640     11.8%
r                   75374     10.7%
e                   74794     10.6%
a                   65548     9.3%
b                   41320     5.9%
d                   34070     4.8%
l                   33474     4.8%
t                   33474     4.8%
Other values (4)    34650     4.9%
Uppercase Letter
Value    Count    Frequency (%)
U        41320    27.5%
L        41320    27.5%
S        33474    22.3%
C        32484    21.6%
V        1006     0.7%
D        580      0.4%
Space Separator
Value      Count    Frequency (%)
(space)    41320    100.0%

Most occurring scripts

Value     Count     Frequency (%)
Latin     853252    95.4%
Common    41320     4.6%

Most frequent character per script

Latin
Value                Count     Frequency (%)
o                    142918    16.7%
i                    84806     9.9%
n                    82640     9.7%
r                    75374     8.8%
e                    74794     8.8%
a                    65548     7.7%
U                    41320     4.8%
L                    41320     4.8%
b                    41320     4.8%
d                    34070     4.0%
Other values (10)    169142    19.8%
Common
Value      Count    Frequency (%)
(space)    41320    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    894572    100.0%

edad
Real number (ℝ≥0)

MISSING

Distinct: 74
Distinct (%): 0.1%
Missing: 168709
Missing (%): 63.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.13916323
Minimum: 18
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 18
5-th percentile: 26
Q1: 35
Median: 43
Q3: 52
95-th percentile: 64
Maximum: 98
Range: 80
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.92553382
Coefficient of variation (CV): 0.270180333
Kurtosis: -0.3887701935
Mean: 44.13916323
Median Absolute Deviation (MAD): 9
Skewness: 0.3365665568
Sum: 4330714
Variance: 142.2183569
Monotonicity: Not monotonic
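
Several of the figures above are derived from one another, and the mean is taken over the non-missing rows only. A short sketch of those relationships, using values copied from the edad statistics; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded in the report.

```python
# Values copied from the edad statistics above.
n_observations = 266824
missing = 168709
total = 4330714          # "Sum"
std = 11.92553382        # "Standard deviation"
q1, q3 = 35, 52
minimum, maximum = 18, 98

n_present = n_observations - missing   # rows that actually have an age
mean = total / n_present               # mean ignores missing rows
variance = std ** 2
cv = std / mean                        # coefficient of variation
iqr = q3 - q1
value_range = maximum - minimum

print(f"non-missing rows: {n_present}")
print(f"mean = {mean:.8f}, variance = {variance:.7f}, CV = {cv:.9f}")
print(f"IQR = {iqr}, range = {value_range}")
```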
Histogram with fixed size bins (bins=50)
Common values
Value                Count     Frequency (%)
44                   3614      1.4%
40                   3359      1.3%
41                   3123      1.2%
45                   3111      1.2%
49                   2996      1.1%
35                   2994      1.1%
46                   2883      1.1%
31                   2843      1.1%
36                   2759      1.0%
37                   2757      1.0%
Other values (64)    67676     25.4%
(Missing)            168709    63.2%

Minimum 10 values
Value    Count    Frequency (%)
18       12       < 0.1%
19       72       < 0.1%
20       222      0.1%
21       326      0.1%
22       465      0.2%
23       653      0.2%
24       784      0.3%
25       1195     0.4%
26       1476     0.6%
27       1659     0.6%

Maximum 10 values
Value    Count    Frequency (%)
98       2        < 0.1%
90       3        < 0.1%
89       1        < 0.1%
88       2        < 0.1%
87       1        < 0.1%
86       1        < 0.1%
85       6        < 0.1%
84       59       < 0.1%
83       50       < 0.1%
82       27       < 0.1%

municipio_residencia
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
ARAUCA: 130507
TAME: 69756
SARAVENA: 35432
Otros: 16755
ARAUQUITA: 14374

Length

Max length: 9
Median length: 6
Mean length: 5.841539742
Min length: 4

Characters and Unicode

Total characters: 1558663
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SARAVENA
2nd row: ARAUCA
3rd row: ARAUQUITA
4th row: ARAUCA
5th row: ARAUQUITA

Common Values

Value        Count     Frequency (%)
ARAUCA       130507    48.9%
TAME         69756     26.1%
SARAVENA     35432     13.3%
Otros        16755     6.3%
ARAUQUITA    14374     5.4%

Length

Histogram of lengths of the category

Pie chart

Value        Count     Frequency (%)
arauca       130507    48.9%
tame         69756     26.1%
saravena     35432     13.3%
otros        16755     6.3%
arauquita    14374     5.4%

Most occurring characters

Value               Count     Frequency (%)
A                   610695    39.2%
R                   180313    11.6%
U                   159255    10.2%
C                   130507    8.4%
E                   105188    6.7%
T                   84130     5.4%
M                   69756     4.5%
S                   35432     2.3%
V                   35432     2.3%
N                   35432     2.3%
Other values (7)    112523    7.2%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    1491643    95.7%
Lowercase Letter    67020      4.3%

Most frequent character per category

Uppercase Letter
Value               Count     Frequency (%)
A                   610695    40.9%
R                   180313    12.1%
U                   159255    10.7%
C                   130507    8.7%
E                   105188    7.1%
T                   84130     5.6%
M                   69756     4.7%
S                   35432     2.4%
V                   35432     2.4%
N                   35432     2.4%
Other values (3)    45503     3.1%
Lowercase Letter
Value    Count    Frequency (%)
t        16755    25.0%
r        16755    25.0%
o        16755    25.0%
s        16755    25.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1558663    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1558663    100.0%

municipio_nacimiento
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 11704
Missing (%): 4.4%
Memory size: 2.0 MiB
ARAUCA: 171467
Otros: 40864
TAME: 24589
ARAUQUITA: 10188
SARAVENA: 8012

Length

Max length: 9
Median length: 6
Mean length: 5.829672311
Min length: 4

Characters and Unicode

Total characters: 1487266
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Otros
2nd row: ARAUQUITA
3rd row: ARAUQUITA
4th row: ARAUQUITA
5th row: ARAUCA

Common Values

Value        Count     Frequency (%)
ARAUCA       171467    64.3%
Otros        40864     15.3%
TAME         24589     9.2%
ARAUQUITA    10188     3.8%
SARAVENA     8012      3.0%
(Missing)    11704     4.4%

Length

Histogram of lengths of the category

Pie chart

Value        Count     Frequency (%)
arauca       171467    67.2%
otros        40864     16.0%
tame         24589     9.6%
arauquita    10188     4.0%
saravena     8012      3.1%

Most occurring characters

Value               Count     Frequency (%)
A                   593590    39.9%
U                   191843    12.9%
R                   189667    12.8%
C                   171467    11.5%
O                   40864     2.7%
t                   40864     2.7%
r                   40864     2.7%
o                   40864     2.7%
s                   40864     2.7%
T                   34777     2.3%
Other values (7)    101602    6.8%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    1323810    89.0%
Lowercase Letter    163456     11.0%

Most frequent character per category

Uppercase Letter
Value               Count     Frequency (%)
A                   593590    44.8%
U                   191843    14.5%
R                   189667    14.3%
C                   171467    13.0%
O                   40864     3.1%
T                   34777     2.6%
E                   32601     2.5%
M                   24589     1.9%
Q                   10188     0.8%
I                   10188     0.8%
Other values (3)    24036     1.8%
Lowercase Letter
Value    Count    Frequency (%)
t        40864    25.0%
r        40864    25.0%
o        40864    25.0%
s        40864    25.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1487266    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1487266    100.0%

municipio_expedicion
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 142399
Missing (%): 53.4%
Memory size: 2.0 MiB
ARAUCA: 55057
Otros: 39115
TAME: 18371
ARAUQUITA: 7673
SARAVENA: 4209

Length

Max length: 9
Median length: 6
Mean length: 5.64299779
Min length: 4

Characters and Unicode

Total characters: 702130
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Otros
2nd row: ARAUCA
3rd row: ARAUQUITA
4th row: ARAUQUITA
5th row: ARAUCA

Common Values

Value        Count     Frequency (%)
ARAUCA       55057     20.6%
Otros        39115     14.7%
TAME         18371     6.9%
ARAUQUITA    7673      2.9%
SARAVENA     4209      1.6%
(Missing)    142399    53.4%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
arauca       55057    44.2%
otros        39115    31.4%
tame         18371    14.8%
arauquita    7673     6.2%
saravena     4209     3.4%

Most occurring characters

Value               Count     Frequency (%)
A                   219188    31.2%
U                   70403     10.0%
R                   66939     9.5%
C                   55057     7.8%
O                   39115     5.6%
t                   39115     5.6%
r                   39115     5.6%
o                   39115     5.6%
s                   39115     5.6%
T                   26044     3.7%
Other values (7)    68924     9.8%

Most occurring categories

Value               Count     Frequency (%)
Uppercase Letter    545670    77.7%
Lowercase Letter    156460    22.3%

Most frequent character per category

Uppercase Letter
Value               Count     Frequency (%)
A                   219188    40.2%
U                   70403     12.9%
R                   66939     12.3%
C                   55057     10.1%
O                   39115     7.2%
T                   26044     4.8%
E                   22580     4.1%
M                   18371     3.4%
Q                   7673      1.4%
I                   7673      1.4%
Other values (3)    12627     2.3%
Lowercase Letter
Value    Count    Frequency (%)
t        39115    25.0%
r        39115    25.0%
o        39115    25.0%
s        39115    25.0%

Most occurring scripts

Value    Count     Frequency (%)
Latin    702130    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    702130    100.0%

tipo_persona
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Natural: 261755
Juridica: 5069

Length

Max length: 8
Median length: 7
Mean length: 7.018997541
Min length: 7

Characters and Unicode

Total characters: 1872837
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Natural
2nd row: Natural
3rd row: Natural
4th row: Natural
5th row: Natural

Common Values

Value       Count     Frequency (%)
Natural     261755    98.1%
Juridica    5069      1.9%

Length

Histogram of lengths of the category

Pie chart

Value       Count     Frequency (%)
natural     261755    98.1%
juridica    5069      1.9%

Most occurring characters

Value    Count     Frequency (%)
a        528579    28.2%
u        266824    14.2%
r        266824    14.2%
N        261755    14.0%
t        261755    14.0%
l        261755    14.0%
i        10138     0.5%
J        5069      0.3%
d        5069      0.3%
c        5069      0.3%

Most occurring categories

Value               Count      Frequency (%)
Lowercase Letter    1606013    85.8%
Uppercase Letter    266824     14.2%

Most frequent character per category

Lowercase Letter
Value    Count     Frequency (%)
a        528579    32.9%
u        266824    16.6%
r        266824    16.6%
t        261755    16.3%
l        261755    16.3%
i        10138     0.6%
d        5069      0.3%
c        5069      0.3%
Uppercase Letter
Value    Count     Frequency (%)
N        261755    98.1%
J        5069      1.9%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1872837    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1872837    100.0%

tiene_casa_propia
Categorical

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 158961
Missing (%): 59.6%
Memory size: 2.0 MiB
Si: 78054
No: 29809

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 215726
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Si
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value        Count     Frequency (%)
Si           78054     29.3%
No           29809     11.2%
(Missing)    158961    59.6%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
si       78054    72.4%
no       29809    27.6%

Most occurring characters

Value    Count    Frequency (%)
S        78054    36.2%
i        78054    36.2%
N        29809    13.8%
o        29809    13.8%

Most occurring categories

Value               Count     Frequency (%)
Uppercase Letter    107863    50.0%
Lowercase Letter    107863    50.0%

Most frequent character per category

Uppercase Letter
Value    Count    Frequency (%)
S        78054    72.4%
N        29809    27.6%
Lowercase Letter
Value    Count    Frequency (%)
i        78054    72.4%
o        29809    27.6%

Most occurring scripts

Value    Count     Frequency (%)
Latin    215726    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    215726    100.0%

sueldo_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 524
Distinct (%): 0.6%
Missing: 174837
Missing (%): 65.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.2626784
Minimum: 3
Maximum: 600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 3
5-th percentile: 27
Q1: 43
Median: 73
Q3: 130
95-th percentile: 286
Maximum: 600
Range: 597
Interquartile range (IQR): 87

Descriptive statistics

Standard deviation: 90.27400954
Coefficient of variation (CV): 0.8742171995
Kurtosis: 7.070204741
Mean: 103.2626784
Median Absolute Deviation (MAD): 35
Skewness: 2.324892561
Sum: 9498824
Variance: 8149.396798
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
34                    2384      0.9%
36                    1895      0.7%
43                    1869      0.7%
61                    1650      0.6%
72                    1641      0.6%
30                    1616      0.6%
68                    1587      0.6%
46                    1483      0.6%
76                    1448      0.5%
65                    1415      0.5%
Other values (514)    74999     28.1%
(Missing)             174837    65.5%

Minimum 10 values
Value    Count    Frequency (%)
3        13       < 0.1%
4        3        < 0.1%
5        19       < 0.1%
6        31       < 0.1%
7        60       < 0.1%
8        25       < 0.1%
9        37       < 0.1%
10       205      0.1%
11       55       < 0.1%
12       47       < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
600      388      0.1%
599      2        < 0.1%
597      1        < 0.1%
594      1        < 0.1%
593      1        < 0.1%
588      53       < 0.1%
586      6        < 0.1%
584      7        < 0.1%
582      15       < 0.1%
578      1        < 0.1%

otros_ingresos_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 222
Distinct (%): 3.3%
Missing: 260110
Missing (%): 97.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.00417039
Minimum: 0
Maximum: 300
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 7
Q1: 19
Median: 34
Q3: 68
95-th percentile: 182.7
Maximum: 300
Range: 300
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 58.67205411
Coefficient of variation (CV): 1.066683739
Kurtosis: 5.815075647
Mean: 55.00417039
Median Absolute Deviation (MAD): 18
Skewness: 2.338335015
Sum: 369298
Variance: 3442.409933
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
34                    227       0.1%
17                    196       0.1%
18                    193       0.1%
36                    191       0.1%
20                    157       0.1%
23                    156       0.1%
13                    154       0.1%
30                    143       0.1%
21                    142       0.1%
26                    131       < 0.1%
Other values (212)    5024      1.9%
(Missing)             260110    97.5%

Minimum 10 values
Value    Count    Frequency (%)
0        2        < 0.1%
1        9        < 0.1%
2        1        < 0.1%
3        64       < 0.1%
4        45       < 0.1%
5        40       < 0.1%
6        94       < 0.1%
7        119      < 0.1%
8        72       < 0.1%
9        57       < 0.1%

Maximum 10 values
Value    Count    Frequency (%)
300      109      < 0.1%
294      8        < 0.1%
291      11       < 0.1%
289      1        < 0.1%
288      3        < 0.1%
280      2        < 0.1%
279      2        < 0.1%
276      1        < 0.1%
271      7        < 0.1%
268      3        < 0.1%

municipio_credito
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
ARAUCA: 127641
TAME: 63795
SARAVENA: 29609
ARAUQUITA: 23431
Otros: 22348

Length

Max length: 9
Median length: 6
Mean length: 5.923443918
Min length: 4

Characters and Unicode

Total characters: 1580517
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SARAVENA
2nd row: ARAUCA
3rd row: ARAUQUITA
4th row: Otros
5th row: ARAUQUITA

Common Values

Value        Count     Frequency (%)
ARAUCA       127641    47.8%
TAME         63795     23.9%
SARAVENA     29609     11.1%
ARAUQUITA    23431     8.8%
Otros        22348     8.4%

Length

Histogram of lengths of the category

Pie chart

Value        Count     Frequency (%)
arauca       127641    47.8%
tame         63795     23.9%
saravena     29609     11.1%
arauquita    23431     8.8%
otros        22348     8.4%

Most occurring characters

Value               Count     Frequency (%)
A                   605838    38.3%
R                   180681    11.4%
U                   174503    11.0%
C                   127641    8.1%
E                   93404     5.9%
T                   87226     5.5%
M                   63795     4.0%
S                   29609     1.9%
V                   29609     1.9%
N                   29609     1.9%
Other values (7)    158602    10.0%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    1491125    94.3%
Lowercase Letter    89392      5.7%

Most frequent character per category

Uppercase Letter
Value               Count     Frequency (%)
A                   605838    40.6%
R                   180681    12.1%
U                   174503    11.7%
C                   127641    8.6%
E                   93404     6.3%
T                   87226     5.8%
M                   63795     4.3%
S                   29609     2.0%
V                   29609     2.0%
N                   29609     2.0%
Other values (3)    69210     4.6%
Lowercase Letter
Value    Count    Frequency (%)
t        22348    25.0%
r        22348    25.0%
o        22348    25.0%
s        22348    25.0%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1580517    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1580517    100.0%

codeudor
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
SIN CODEUDOR: 239311
CON CODEUDOR: 27513

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 3201888
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SIN CODEUDOR
2nd row: CON CODEUDOR
3rd row: CON CODEUDOR
4th row: CON CODEUDOR
5th row: CON CODEUDOR

Common Values

Value           Count     Frequency (%)
SIN CODEUDOR    239311    89.7%
CON CODEUDOR    27513     10.3%

Length

Histogram of lengths of the category

Pie chart

Value       Count     Frequency (%)
codeudor    266824    50.0%
sin         239311    44.8%
con         27513     5.2%

Most occurring characters

Value      Count     Frequency (%)
O          561161    17.5%
D          533648    16.7%
C          294337    9.2%
N          266824    8.3%
(space)    266824    8.3%
E          266824    8.3%
U          266824    8.3%
R          266824    8.3%
S          239311    7.5%
I          239311    7.5%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    2935064    91.7%
Space Separator     266824     8.3%

Most frequent character per category

Uppercase Letter
Value    Count     Frequency (%)
O        561161    19.1%
D        533648    18.2%
C        294337    10.0%
N        266824    9.1%
E        266824    9.1%
U        266824    9.1%
R        266824    9.1%
S        239311    8.2%
I        239311    8.2%
Space Separator
Value      Count     Frequency (%)
(space)    266824    100.0%

Most occurring scripts

Value     Count      Frequency (%)
Latin     2935064    91.7%
Common    266824     8.3%

Most frequent character per script

Latin
Value    Count     Frequency (%)
O        561161    19.1%
D        533648    18.2%
C        294337    10.0%
N        266824    9.1%
E        266824    9.1%
U        266824    9.1%
R        266824    9.1%
S        239311    8.2%
I        239311    8.2%
Common
Value      Count     Frequency (%)
(space)    266824    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    3201888    100.0%

sector
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 18
Missing (%): < 0.1%
Memory size: 2.0 MiB
PRIVADO: 243930
PUBLICO: 22876

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 1867642
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRIVADO
2nd row: PRIVADO
3rd row: PRIVADO
4th row: PRIVADO
5th row: PRIVADO

Common Values

Value        Count     Frequency (%)
PRIVADO      243930    91.4%
PUBLICO      22876     8.6%
(Missing)    18        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value      Count     Frequency (%)
privado    243930    91.4%
publico    22876     8.6%

Most occurring characters

Value    Count     Frequency (%)
P        266806    14.3%
I        266806    14.3%
O        266806    14.3%
R        243930    13.1%
V        243930    13.1%
A        243930    13.1%
D        243930    13.1%
U        22876     1.2%
B        22876     1.2%
L        22876     1.2%

Most occurring categories

Value               Count      Frequency (%)
Uppercase Letter    1867642    100.0%

Most frequent character per category

Uppercase Letter
Value    Count     Frequency (%)
P        266806    14.3%
I        266806    14.3%
O        266806    14.3%
R        243930    13.1%
V        243930    13.1%
A        243930    13.1%
D        243930    13.1%
U        22876     1.2%
B        22876     1.2%
L        22876     1.2%

Most occurring scripts

Value    Count      Frequency (%)
Latin    1867642    100.0%

Most occurring blocks

Value    Count      Frequency (%)
ASCII    1867642    100.0%

año_credito
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.138233
Minimum: 1993
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1993
5-th percentile: 2001
Q1: 2010
Median: 2014
Q3: 2018
95-th percentile: 2020
Maximum: 2021
Range: 28
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.723918046
Coefficient of variation (CV): 0.002843281177
Kurtosis: -0.1314163128
Mean: 2013.138233
Median Absolute Deviation (MAD): 4
Skewness: -0.7662398934
Sum: 537153596
Variance: 32.76323779
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value              Count   Frequency (%)
2020               28739   10.8%
2015               21814   8.2%
2019               20497   7.7%
2014               17957   6.7%
2017               17407   6.5%
2018               17236   6.5%
2016               16822   6.3%
2013               16573   6.2%
2012               14917   5.6%
2011               13474   5.0%
Other values (17)  81388   30.5%

Value  Count  Frequency (%)
1993   1      < 0.1%
1996   1      < 0.1%
1997   1144   0.4%
1998   2317   0.9%
1999   3094   1.2%
2000   3237   1.2%
2001   3676   1.4%
2002   3648   1.4%
2003   3823   1.4%
2004   4843   1.8%

Value  Count  Frequency (%)
2021   5966   2.2%
2020   28739  10.8%
2019   20497  7.7%
2018   17236  6.5%
2017   17407  6.5%
2016   16822  6.3%
2015   21814  8.2%
2014   17957  6.7%
2013   16573  6.2%
2012   14917  5.6%
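
Several of the año_credito figures above are derived from one another. The following is a minimal sketch that re-derives them from the reported values; the variable names are illustrative, and the inputs are the rounded figures printed in this report, so agreement is only expected up to rounding.

```python
# Figures copied from the año_credito statistics in this report.
n_observations = 266824
mean = 2013.138233
std_dev = 5.723918046
q1, q3 = 2010, 2018
minimum, maximum = 1993, 2021

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# Interquartile range and range follow directly from the quantiles.
iqr = q3 - q1
value_range = maximum - minimum

# The sum is the mean times the number of observations
# (approximate here, because the mean is rounded).
approx_sum = mean * n_observations

print(f"CV       = {cv:.12f}")
print(f"Variance = {variance:.8f}")
print(f"IQR      = {iqr}, Range = {value_range}")
print(f"Sum      ~ {approx_sum:.0f}")
```

The very small CV is a consequence of the variable being a calendar year: the values sit far from zero relative to their spread, so the ratio says little about dispersion here.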

mes_credito
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.880040776
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.498950777
Coefficient of variation (CV): 0.5085654127
Kurtosis: -1.227123233
Mean: 6.880040776
Median Absolute Deviation (MAD): 3
Skewness: -0.1234824413
Sum: 1835760
Variance: 12.24265654
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count   Frequency (%)
12                29294   11.0%
10                25731   9.6%
11                24466   9.2%
9                 22504   8.4%
7                 22413   8.4%
6                 21385   8.0%
5                 21295   8.0%
8                 20835   7.8%
3                 20443   7.7%
2                 19931   7.5%
Other values (2)  38527   14.4%

Value  Count  Frequency (%)
1      19465  7.3%
2      19931  7.5%
3      20443  7.7%
4      19062  7.1%
5      21295  8.0%
6      21385  8.0%
7      22413  8.4%
8      20835  7.8%
9      22504  8.4%
10     25731  9.6%

Value  Count  Frequency (%)
12     29294  11.0%
11     24466  9.2%
10     25731  9.6%
9      22504  8.4%
8      20835  7.8%
7      22413  8.4%
6      21385  8.0%
5      21295  8.0%
4      19062  7.1%
3      20443  7.7%

valor_credito_smdlv
Real number (ℝ≥0)

Distinct: 672
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.19756094
Minimum: 0
Maximum: 700
Zeros: 2290
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 15
median: 37
Q3: 68
95-th percentile: 200
Maximum: 700
Range: 700
Interquartile range (IQR): 53

Descriptive statistics

Standard deviation: 73.32605347
Coefficient of variation (CV): 1.281978677
Kurtosis: 19.22517259
Mean: 57.19756094
Median Absolute Deviation (MAD): 25
Skewness: 3.593908011
Sum: 15261682
Variance: 5376.710118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
6                   8907     3.3%
5                   5947     2.2%
7                   5743     2.2%
1                   5497     2.1%
2                   5289     2.0%
15                  5055     1.9%
16                  4541     1.7%
14                  4367     1.6%
8                   3908     1.5%
4                   3823     1.4%
Other values (662)  213747   80.1%

Value  Count  Frequency (%)
0      2290   0.9%
1      5497   2.1%
2      5289   2.0%
3      3813   1.4%
4      3823   1.4%
5      5947   2.2%
6      8907   3.3%
7      5743   2.2%
8      3908   1.5%
9      3076   1.2%

Value  Count  Frequency (%)
700    466    0.2%
698    1      < 0.1%
697    2      < 0.1%
696    1      < 0.1%
695    2      < 0.1%
694    3      < 0.1%
693    2      < 0.1%
692    1      < 0.1%
691    2      < 0.1%
690    1      < 0.1%

estado_final
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

PAGADO VENCIDO: 94542
CONTADO: 78448
PAGADO ANTICIPADO: 49231
PAGADO A TIEMPO: 21853
DESCUENTO EN VENTA: 14297
Other values (3): 8453

Length

Max length: 18
Median length: 14
Mean length: 12.79885617
Min length: 7

Characters and Unicode

Total characters: 3415042
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PAGADO ANTICIPADO
2nd row: DESCUENTO EN VENTA
3rd row: PAGADO A TIEMPO
4th row: PAGADO ANTICIPADO
5th row: PAGADO VENCIDO

Common Values

Value               Count  Frequency (%)
PAGADO VENCIDO      94542  35.4%
CONTADO             78448  29.4%
PAGADO ANTICIPADO   49231  18.5%
PAGADO A TIEMPO     21853  8.2%
DESCUENTO EN VENTA  14297  5.4%
CARTERA CASTIGADA   4028   1.5%
OTROS CIERRES       2508   0.9%
DEVOLUCION          1917   0.7%
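
The length and character totals reported for estado_final can be re-derived from the Common Values table above, because that table lists all eight categories. The sketch below is illustrative only: the dictionary simply copies the counts from this report, and the variable names are assumptions.

```python
# Category counts copied from the estado_final Common Values table.
estado_final_counts = {
    "PAGADO VENCIDO": 94542,
    "CONTADO": 78448,
    "PAGADO ANTICIPADO": 49231,
    "PAGADO A TIEMPO": 21853,
    "DESCUENTO EN VENTA": 14297,
    "CARTERA CASTIGADA": 4028,
    "OTROS CIERRES": 2508,
    "DEVOLUCION": 1917,
}

# Number of observations is the sum of all category counts.
n_observations = sum(estado_final_counts.values())

# Total characters: each label's length weighted by how often it occurs.
total_characters = sum(len(label) * count
                       for label, count in estado_final_counts.items())

# Space separators: spaces per label, weighted the same way.
total_spaces = sum(label.count(" ") * count
                   for label, count in estado_final_counts.items())

# Mean length is total characters divided by the number of observations.
mean_length = total_characters / n_observations

# Distinct characters across all labels (letters plus the space).
distinct_characters = len(set("".join(estado_final_counts)))

print(n_observations, total_characters, total_spaces)
print(f"mean length = {mean_length:.8f}")
print(f"distinct characters = {distinct_characters}")
```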

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
pagado165626
33.8%
vencido94542
19.3%
contado78448
16.0%
anticipado49231
 
10.1%
a21853
 
4.5%
tiempo21853
 
4.5%
en14297
 
2.9%
venta14297
 
2.9%
descuento14297
 
2.9%
cartera4028
 
0.8%
Other values (4)10961
 
2.2%

Most occurring characters

ValueCountFrequency (%)
A564452
16.5%
O511295
15.0%
D408089
11.9%
N267029
7.8%
C248999
7.3%
P236710
6.9%
I223310
 
6.5%
222609
 
6.5%
T188690
 
5.5%
E184544
 
5.4%
Other values (7)359315
10.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3192433
93.5%
Space Separator222609
 
6.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A564452
17.7%
O511295
16.0%
D408089
12.8%
N267029
8.4%
C248999
7.8%
P236710
7.4%
I223310
 
7.0%
T188690
 
5.9%
E184544
 
5.8%
G169654
 
5.3%
Other values (6)189661
 
5.9%
Space Separator
ValueCountFrequency (%)
222609
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3192433
93.5%
Common222609
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A564452
17.7%
O511295
16.0%
D408089
12.8%
N267029
8.4%
C248999
7.8%
P236710
7.4%
I223310
 
7.0%
T188690
 
5.9%
E184544
 
5.8%
G169654
 
5.3%
Other values (6)189661
 
5.9%
Common
ValueCountFrequency (%)
222609
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3415042
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A564452
16.5%
O511295
15.0%
D408089
11.9%
N267029
7.8%
C248999
7.3%
P236710
6.9%
I223310
 
6.5%
222609
 
6.5%
T188690
 
5.5%
E184544
 
5.4%
Other values (7)359315
10.5%

tipo_venta
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

ELECTRODOMESTICOS: 265907
MOTOS: 886
CONTRATO: 31

Length

Max length: 17
Median length: 17
Mean length: 16.95910788
Min length: 5

Characters and Unicode

Total characters: 4525097
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ELECTRODOMESTICOS
2nd row: ELECTRODOMESTICOS
3rd row: ELECTRODOMESTICOS
4th row: ELECTRODOMESTICOS
5th row: ELECTRODOMESTICOS

Common Values

Value              Count   Frequency (%)
ELECTRODOMESTICOS  265907  99.7%
MOTOS              886     0.3%
CONTRATO           31      < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
electrodomesticos265907
99.7%
motos886
 
0.3%
contrato31
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
O799555
17.7%
E797721
17.6%
T532762
11.8%
S532700
11.8%
C531845
11.8%
M266793
 
5.9%
R265938
 
5.9%
L265907
 
5.9%
D265907
 
5.9%
I265907
 
5.9%
Other values (2)62
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter4525097
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O799555
17.7%
E797721
17.6%
T532762
11.8%
S532700
11.8%
C531845
11.8%
M266793
 
5.9%
R265938
 
5.9%
L265907
 
5.9%
D265907
 
5.9%
I265907
 
5.9%
Other values (2)62
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin4525097
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O799555
17.7%
E797721
17.6%
T532762
11.8%
S532700
11.8%
C531845
11.8%
M266793
 
5.9%
R265938
 
5.9%
L265907
 
5.9%
D265907
 
5.9%
I265907
 
5.9%
Other values (2)62
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4525097
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O799555
17.7%
E797721
17.6%
T532762
11.8%
S532700
11.8%
C531845
11.8%
M266793
 
5.9%
R265938
 
5.9%
L265907
 
5.9%
D265907
 
5.9%
I265907
 
5.9%
Other values (2)62
 
< 0.1%

periodo_credito
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

MENSUAL(ES): 210741
DIARIA(S): 53657
SEMANAL(ES): 1804
QUINCENAL(ES): 622

Length

Max length: 13
Median length: 11
Mean length: 10.60247204
Min length: 9

Characters and Unicode

Total characters: 2828994
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: QUINCENAL(ES)
2nd row: MENSUAL(ES)
3rd row: MENSUAL(ES)
4th row: MENSUAL(ES)
5th row: MENSUAL(ES)

Common Values

Value          Count   Frequency (%)
MENSUAL(ES)    210741  79.0%
DIARIA(S)      53657   20.1%
SEMANAL(ES)    1804    0.7%
QUINCENAL(ES)  622     0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
mensual(es210741
79.0%
diaria(s53657
 
20.1%
semanal(es1804
 
0.7%
quincenal(es622
 
0.2%

Most occurring characters

ValueCountFrequency (%)
S479369
16.9%
E426334
15.1%
A322285
11.4%
(266824
9.4%
)266824
9.4%
N213789
7.6%
L213167
7.5%
M212545
7.5%
U211363
7.5%
I107936
 
3.8%
Other values (4)108558
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter2295346
81.1%
Open Punctuation266824
 
9.4%
Close Punctuation266824
 
9.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S479369
20.9%
E426334
18.6%
A322285
14.0%
N213789
9.3%
L213167
9.3%
M212545
9.3%
U211363
9.2%
I107936
 
4.7%
D53657
 
2.3%
R53657
 
2.3%
Other values (2)1244
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(266824
100.0%
Close Punctuation
ValueCountFrequency (%)
)266824
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2295346
81.1%
Common533648
 
18.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
S479369
20.9%
E426334
18.6%
A322285
14.0%
N213789
9.3%
L213167
9.3%
M212545
9.3%
U211363
9.2%
I107936
 
4.7%
D53657
 
2.3%
R53657
 
2.3%
Other values (2)1244
 
0.1%
Common
ValueCountFrequency (%)
(266824
50.0%
)266824
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2828994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S479369
16.9%
E426334
15.1%
A322285
11.4%
(266824
9.4%
)266824
9.4%
N213789
7.6%
L213167
7.5%
M212545
7.5%
U211363
7.5%
I107936
 
3.8%
Other values (4)108558
 
3.8%

cuotas
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.928803256
Minimum: 0
Maximum: 14
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 1
Q3: 6
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.194594669
Coefficient of variation (CV): 1.067652004
Kurtosis: -0.06608681909
Mean: 3.928803256
Median Absolute Deviation (MAD): 0
Skewness: 1.172608222
Sum: 1048299
Variance: 17.59462443
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value             Count    Frequency (%)
1                 150307   56.3%
10                22835    8.6%
5                 14883    5.6%
14                14199    5.3%
3                 10935    4.1%
2                 10602    4.0%
6                 10588    4.0%
12                8202     3.1%
4                 8035     3.0%
9                 5644     2.1%
Other values (5)  10594    4.0%

Value  Count   Frequency (%)
0      2       < 0.1%
1      150307  56.3%
2      10602   4.0%
3      10935   4.1%
4      8035    3.0%
5      14883   5.6%
6      10588   4.0%
7      2008    0.8%
8      4062    1.5%
9      5644    2.1%

Value  Count  Frequency (%)
14     14199  5.3%
13     625    0.2%
12     8202   3.1%
11     3897   1.5%
10     22835  8.6%
9      5644   2.1%
8      4062   1.5%
7      2008   0.8%
6      10588  4.0%
5      14883  5.6%
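
The value counts above cover all 15 distinct values of cuotas, so the mean, median, Q3 and MAD can be recomputed from the frequency table alone. The sketch below does that with a hypothetical helper, `quantile_from_counts`, which uses a simple inverse-CDF definition of a quantile. That definition is an assumption and may differ from the interpolation the report uses, although the two agree for these figures.

```python
# Frequency table for cuotas, copied from the value counts in this report.
cuotas_counts = {
    0: 2, 1: 150307, 2: 10602, 3: 10935, 4: 8035,
    5: 14883, 6: 10588, 7: 2008, 8: 4062, 9: 5644,
    10: 22835, 11: 3897, 12: 8202, 13: 625, 14: 14199,
}


def quantile_from_counts(counts, q):
    """Smallest value whose cumulative count reaches fraction q of the total."""
    total = sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= q * total:
            return value
    raise ValueError("empty frequency table")


n_observations = sum(cuotas_counts.values())
total_sum = sum(value * count for value, count in cuotas_counts.items())
mean = total_sum / n_observations
median = quantile_from_counts(cuotas_counts, 0.5)
q3 = quantile_from_counts(cuotas_counts, 0.75)

# MAD: the median of the absolute deviations from the median.
deviation_counts = {}
for value, count in cuotas_counts.items():
    deviation = abs(value - median)
    deviation_counts[deviation] = deviation_counts.get(deviation, 0) + count
mad = quantile_from_counts(deviation_counts, 0.5)

print(n_observations, total_sum, f"{mean:.9f}", median, q3, mad)
```

The MAD is 0 because more than half of the records (56.3%) have exactly one installment, which is also the median, so the median absolute deviation collapses to zero even though the standard deviation is above 4.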

forma_pago
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

CRÉDITO: 182518
CONTADO: 78448
LIBRANZA: 5858

Length

Max length: 8
Median length: 7
Mean length: 7.021954547
Min length: 7

Characters and Unicode

Total characters: 1873626
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CRÉDITO
2nd row: CRÉDITO
3rd row: CRÉDITO
4th row: CRÉDITO
5th row: CRÉDITO

Common Values

Value     Count   Frequency (%)
CRÉDITO   182518  68.4%
CONTADO   78448   29.4%
LIBRANZA  5858    2.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
crédito182518
68.4%
contado78448
29.4%
libranza5858
 
2.2%

Most occurring characters

ValueCountFrequency (%)
O339414
18.1%
C260966
13.9%
D260966
13.9%
T260966
13.9%
R188376
10.1%
I188376
10.1%
É182518
9.7%
A90164
 
4.8%
N84306
 
4.5%
L5858
 
0.3%
Other values (2)11716
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1873626
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O339414
18.1%
C260966
13.9%
D260966
13.9%
T260966
13.9%
R188376
10.1%
I188376
10.1%
É182518
9.7%
A90164
 
4.8%
N84306
 
4.5%
L5858
 
0.3%
Other values (2)11716
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin1873626
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O339414
18.1%
C260966
13.9%
D260966
13.9%
T260966
13.9%
R188376
10.1%
I188376
10.1%
É182518
9.7%
A90164
 
4.8%
N84306
 
4.5%
L5858
 
0.3%
Other values (2)11716
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1691108
90.3%
Latin 1 Sup182518
 
9.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O339414
20.1%
C260966
15.4%
D260966
15.4%
T260966
15.4%
R188376
11.1%
I188376
11.1%
A90164
 
5.3%
N84306
 
5.0%
L5858
 
0.3%
B5858
 
0.3%
Latin 1 Sup
ValueCountFrequency (%)
É182518
100.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
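
As a minimal sketch of the calculation just described, the function below divides the covariance by the product of the standard deviations. The function name and the toy data are illustrative assumptions, not taken from the report.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the two standard deviations
    # cancel, so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Toy data (hypothetical): y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

r_original = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 7.0 for v in y])
# Reversing the direction of the relationship flips the sign.
r_reversed = pearson_r(x, list(reversed(y)))

print(r_original, r_rescaled, r_reversed)
```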

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
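
The rank-based calculation can be sketched as follows: replace each value by its rank, then apply the Pearson formula to the ranks. Assigning tied values their average rank is an assumption of this sketch (a common convention), and the helper names and toy data are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y = x**2 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy input the ranks of x and y coincide, so ρ reaches its maximum while r stays below 1, which is the sense in which ρ captures nonlinear monotonic relationships.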

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
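
The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows; it assumes that pairs tied in either variable count as neither concordant nor discordant, and statistical libraries often apply a tie-adjusted variant instead, so results can differ when ties are present. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    total_pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        total_pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0: a tie, counted in neither group
    if total_pairs == 0:
        raise ValueError("at least two observations are required")
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(tau)
```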

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
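
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table is given below. It follows the correction commonly attributed to Bergsma (2013): the chi-squared based φ² is reduced by (k-1)(r-1)/(n-1), and the table dimensions are shrunk correspondingly. Whether the report's implementation matches this line by line is an assumption, and the function name and toy tables are illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denominator)


# Toy tables (hypothetical counts).
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_partial = cramers_v_corrected([[3, 1], [1, 3]])

print(v_independent, v_perfect, v_partial)
```

For the small third table the uncorrected estimate would be 0.5, while the corrected value is lower, which illustrates the upward bias of the uncorrected estimator in small samples.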

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
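
Nullity correlation can be sketched as an ordinary correlation between presence indicators (1 if a value is present, 0 if it is missing). The example below is illustrative only: it uses the presence pattern of three columns in the ten "First rows" shown in this report, with `None` standing in for NaN, so the numbers describe that tiny sample and not the full dataset. The function name is an assumption.

```python
import math


def nullity_correlation(col_a, col_b):
    """Correlation between the presence indicators of two columns."""
    a = [0 if value is None else 1 for value in col_a]
    b = [0 if value is None else 1 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    if ss_a == 0 or ss_b == 0:
        # A column that is always present (or always missing) has a
        # constant indicator, so the correlation is undefined.
        return None
    return cov / math.sqrt(ss_a * ss_b)


# Values from the ten "First rows" of this report (None = NaN).
edad = [None, 70.0, None, None, None, None, None, None, None, None]
sueldo_smdlv = [None, 170.0, None, None, None, None, None, None, None, None]
municipio_nacimiento = [None, "Otros", "ARAUQUITA", None, "ARAUQUITA",
                        "ARAUQUITA", None, "ARAUCA", None, None]
año_credito = [2020, 2020, 2017, 2019, 2018, 2019, 2019, 2007, 2019, 2019]

same_pattern = nullity_correlation(edad, sueldo_smdlv)
partial_overlap = nullity_correlation(edad, municipio_nacimiento)
never_missing = nullity_correlation(edad, año_credito)

print(same_pattern, partial_overlap, never_missing)
```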

Sample

First rows

Row  procedencia  genero  estado_civil  edad  municipio_residencia  municipio_nacimiento  municipio_expedicion  tipo_persona  tiene_casa_propia  sueldo_smdlv  otros_ingresos_smdlv  municipio_credito  codeudor  sector  año_credito  mes_credito  valor_credito_smdlv  estado_final  tipo_venta  periodo_credito  cuotas  forma_pago
01NacionalNaNNaNNaNSARAVENANaNNaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO202012356PAGADO ANTICIPADOELECTRODOMESTICOSQUINCENAL(ES)1CRÉDITO
12NacionalNaNNaN70.0ARAUCAOtrosOtrosNaturalSi170.0NaNARAUCACON CODEUDORPRIVADO2020749DESCUENTO EN VENTAELECTRODOMESTICOSMENSUAL(ES)5CRÉDITO
23NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUCANaturalNaNNaNNaNARAUQUITACON CODEUDORPRIVADO20171118PAGADO A TIEMPOELECTRODOMESTICOSMENSUAL(ES)5CRÉDITO
34NacionalNaNNaNNaNARAUCANaNNaNNaturalNaNNaNNaNOtrosCON CODEUDORPRIVADO20191065PAGADO ANTICIPADOELECTRODOMESTICOSMENSUAL(ES)14CRÉDITO
45NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUQUITANaturalNaNNaNNaNARAUQUITACON CODEUDORPRIVADO2018561PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)12CRÉDITO
56NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUQUITANaturalNaNNaNNaNARAUQUITACON CODEUDORPRIVADO20196114PAGADO ANTICIPADOELECTRODOMESTICOSMENSUAL(ES)12CRÉDITO
67NacionalNaNNaNNaNARAUCANaNNaNNaturalNaNNaNNaNSARAVENACON CODEUDORPRIVADO201911349PAGADO VENCIDOMOTOSMENSUAL(ES)12CRÉDITO
78NacionalNaNNaNNaNSARAVENAARAUCAARAUCANaturalNaNNaNNaNSARAVENACON CODEUDORPRIVADO20071151PAGADO ANTICIPADOELECTRODOMESTICOSMENSUAL(ES)6CRÉDITO
89NacionalNaNNaNNaNARAUCANaNNaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO201912284PAGADO VENCIDOMOTOSMENSUAL(ES)2CRÉDITO
910NacionalNaNNaNNaNARAUCANaNNaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO201911124PAGADO VENCIDOELECTRODOMESTICOSMENSUAL(ES)12CRÉDITO

Last rows

Row  procedencia  genero  estado_civil  edad  municipio_residencia  municipio_nacimiento  municipio_expedicion  tipo_persona  tiene_casa_propia  sueldo_smdlv  otros_ingresos_smdlv  municipio_credito  codeudor  sector  año_credito  mes_credito  valor_credito_smdlv  estado_final  tipo_venta  periodo_credito  cuotas  forma_pago
266814266815NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2017181DESCUENTO EN VENTAELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO
266815266816NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2015142PAGADO ANTICIPADOELECTRODOMESTICOSSEMANAL(ES)8CRÉDITO
266816266817NacionalNaNNaNNaNARAUCAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2017814PAGADO A TIEMPOELECTRODOMESTICOSSEMANAL(ES)12CRÉDITO
266817266818NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2016471DESCUENTO EN VENTAELECTRODOMESTICOSSEMANAL(ES)12CRÉDITO
266818266819NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO20171116DESCUENTO EN VENTAELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO
266819266820NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2011850PAGADO VENCIDOELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO
266820266821NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO2016844PAGADO VENCIDOELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO
266821266822NacionalNaNNaNNaNSARAVENAARAUCANaNNaturalNaNNaNNaNSARAVENASIN CODEUDORPRIVADO20171052DESCUENTO EN VENTAELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO
266822266823NacionalNaNNaNNaNARAUCAARAUCANaNNaturalNaNNaNNaNARAUCASIN CODEUDORPRIVADO200737CONTADOELECTRODOMESTICOSSEMANAL(ES)1CONTADO
266823266824NacionalNaNNaNNaNTAMEARAUCANaNNaturalNaNNaNNaNTAMESIN CODEUDORPRIVADO20179117PAGADO VENCIDOELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO